Answering Top-k Queries Over a Mixture of Attractive and Repulsive Dimensions
Abstract
In this paper, we formulate a top-k query that compares objects in a database to a user-provided query object using a novel scoring function. The proposed scoring function combines the ideas of attractive and repulsive dimensions into a general framework to overcome the weaknesses of traditional distance or similarity measures. We study the properties of the proposed class of scoring functions and develop efficient and scalable index structures that index the isolines of the function. We demonstrate various scenarios where the query finds application. Empirical evaluation demonstrates a performance gain of one to two orders of magnitude in querying time over existing state-of-the-art top-k techniques. Further, a qualitative analysis is performed on a real dataset to highlight the potential of the proposed query in discovering hidden data characteristics.
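The abstract does not state the form of the scoring function, so the following is only an illustrative sketch of the query idea, not the paper's method. It assumes a hypothetical score that grows with distance from the query on repulsive dimensions and shrinks with distance on attractive dimensions, and it answers the top-k query by a brute-force linear scan rather than by the isoline index structures the paper develops. The names `mixed_score` and `top_k` and the sample data are invented for illustration.

```python
import heapq

def mixed_score(obj, query, attractive, repulsive):
    """Hypothetical score (an assumption, not the paper's definition):
    distance on repulsive dimensions minus distance on attractive dimensions."""
    far = sum(abs(obj[d] - query[d]) for d in repulsive)
    near = sum(abs(obj[d] - query[d]) for d in attractive)
    return far - near

def top_k(objects, query, attractive, repulsive, k):
    """Brute-force baseline: score every object and keep the k best, best first."""
    return heapq.nlargest(
        k, objects,
        key=lambda o: mixed_score(o, query, attractive, repulsive),
    )

# Dimension 0 is attractive (stay close to the query),
# dimension 1 is repulsive (be far from the query).
objects = [(1.0, 9.0), (1.5, 2.0), (8.0, 9.5), (0.9, 0.1)]
query = (1.0, 1.0)
best = top_k(objects, query, attractive=[0], repulsive=[1], k=2)
print(best)
```

A linear scan like this touches every object per query, which is the cost an index over the scoring function is meant to avoid.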
Similar resources
A Dynamic Method for Answering Ad Hoc Continuous Aggregate Queries
Data streams are infinite, fast, time-stamped data elements that arrive at an explosive rate. Generally, these elements need to be processed online and in real time, so algorithms that process data streams and answer queries over them are mostly one-pass. Executing such algorithms raises challenges such as memory limitations, scheduling, and accuracy of answers. They will be more ...
Linear Sketches for Approximate Aggregate Range Queries
Answering aggregate queries approximately over multidimensional data is an important problem that arises naturally in many applications. One approach to the problem is to maintain a succinct (i.e., O(k) space) representation ĥ, called a sketch, of the frequency distribution h of the data, and to use ĥ for answering queries. Common sketches are constructed via linear mappings of h onto a k-dimensional sp...
A Generic Framework for Top-k Pairs and Top-k Objects Queries over Sliding Windows
Top-k pairs and top-k objects queries have received significant attention by the research community. In this paper, we present the first approach to answer a broad class of top-k pairs and top-k objects queries over sliding windows. Our framework handles multiple top-k queries and each query is allowed to use a different scoring function, a different value of k and a different size of the slidi...
Answering Top K Queries Efficiently with Overlap in Sources and Source Paths
Challenges in answering queries over Web-accessible sources are selecting the sources that must be accessed and computing answers efficiently. Both tasks become more difficult when there is overlap among sources and when sources may return answers of varying quality. The objective is to obtain the best answers while minimizing the costs or delay in computing these answers and is similar to solv...
Encoding Two-Dimensional Range Top-k Queries
We consider various encodings that support range Top-k queries on a two-dimensional array containing elements from a total order. For an m × n array, with m ≤ n, we first propose an almost optimal encoding for answering one-sided Top-k queries, whose query range is restricted to [1...m][1...a], for 1 ≤ a ≤ n. Next, we propose an encoding for the general Top-k queries that takes m^2 lg ((k...
Journal: PVLDB
Volume: 5
Publication date: 2011